Universal patterns in sound amplitudes of songs and music genres 
We report a statistical analysis of more than eight thousand songs. Specifically, we investigate
the probability distribution of the normalized sound amplitudes. Our findings suggest a
universal form of distribution that agrees well with a one-parameter stretched Gaussian. We
also argue that this parameter can give information on music complexity and, consequently,
can help to classify songs as well as music genres. Additionally, we present statistical
evidence that correlation aspects of the songs are directly related to the non-Gaussian nature
of their sound amplitude distributions.

PACS numbers: 89.90.+n, 89.20.-a, 43.75.St, 05.45.Tp






In recent years, studies of complex systems have become widespread in the scientific
community, especially in statistical physics [1-5]. Many of these investigations deal with
data records ordered in time or space (i.e., time series), trying to extract features,
patterns or laws that may be present in the systems studied. This approach has been
successfully applied to a variety of fields, from physics and astronomy [6] to genetics and
economics [7, 8]. Moreover, this framework has increasingly been used to investigate and
model interdisciplinary fields, such as religion [9], elections [10], vehicular traffic [11],
tournaments [12], and many others. These few examples, and social phenomena in general [13],
illustrate how far physicists have moved from their traditional domain of investigation.

Music is a well-known worldwide social phenomenon linked to human cognitive habits, modes
of consciousness, and historical developments [14]. Regarding music's social role, some
authors have investigated collective listening habits. For instance, Lambiotte and
Ausloos [15] analyzed data from people's music libraries, finding audience groups whose size
distribution follows a power law. They also investigated correlations among these music
groups, reporting non-trivial relations [16]. In another work, Silva et al. [17] studied the
network structure of the songwriters and singers of Brazilian popular music (mpb). There is
also interest in the behavior of music sales [18] as well as in the success of
musicians [19-21].

Beyond these cultural aspects, songs form a highly organized system with very complex
structures and long-range correlations. All these features have attracted the attention of
statistical physicists. In a seminal paper, Voss and Clarke [22] analyzed the power spectrum
of radio stations and observed a 1/f-noise-like pattern. They also showed that the
correlation can extend to longer or shorter time scales, depending on the music genre.
Hsu and Hsu [23] investigated the changes of acoustic
frequency in Bach's and Mozart's compositions, finding self-similarity and fractal
structures. In contrast, they report no resemblance to fractal geometry [24] for modern
music. Fractal structures have also been reported in the study of sequences of music
notes [25], where Su and Wu [26] suggest that the multifractal spectrum can be used to
distinguish different styles of music. By using sound amplitudes of songs, Bigerelle and
Iost [27] achieved a classification based on fractal dimension using the entire frequency
range. However, as pointed out by Ro and Kwon [28], the 1/f analysis in the region below
20 Hz might not classify music genres. Gündüz and Gündüz [29] reported an analysis of several
Turkish songs using a variety of techniques. Beltrán del Río et al. [30] evaluated the rank
distribution of music notes in a large selection of music, finding good agreement with a
two-parameter beta distribution. Dagdug et al. [31] investigated a specific piece by Mozart
employing detrended fluctuation analysis (DFA) [32]. Applying DFA to a volatility-like
series, Jennings et al. [33] found quantitative differences in the Hurst exponent depending
on the music genre.

In this brief literature review, we see that special attention has been paid to the fractal
structures of music, correlations and power spectrum analysis. However, much less attention
has been paid to understanding the amplitude distribution. This last point was noted by
Diodati and Piazza [34]. In their work, they investigated the distribution of times and
sound amplitudes larger than a fixed value. By using this kind of return interval
analysis [35], they found Gaussian distributions in the amplitude for jazz, pop, and rock
music, while non-Gaussian distributions emerge for classical pieces. Here, we directly
investigate the amplitude distributions of songs of several genres without employing a
threshold value as considered by Diodati and Piazza. Moreover, our analysis aims at finding
patterns in the sound amplitude distribution by using a suitable one-parameter probability
distribution function (pdf). In the following, we present the dataset used in our
investigation, the analysis of the shape of the resulting distributions, and our conclusions.

FIG. 1: The normalized sound amplitude of (a) a classical piece and (b) a heavy metal song
(labeled in the figure). Note that the signals are quite different: the first presents a
more complex structure characterized by "bursts", while the second resembles Gaussian noise.

Not all sound is music, but music is certainly made of sounds. The sounds that we hear are
a consequence of pressure fluctuations traveling through the air and hitting our ears. These
audible pressure fluctuations can be converted into a voltage signal $u_t$ by a recording
system and stored, for instance, on a compact disc (CD). Our analysis is focused on this
time series $u_t$, which we call the sound amplitude. In the case of songs stored on CDs,
$u_t$ has a standard sampling rate of 44.1 kHz and encompasses the full human audible range
(approximately between 20 Hz and 20 kHz).

Our database comprises 8115 songs of nine different music genres: classical (907),
tango (992), jazz (700), hip-hop (876), mpb (580), flamenco (524), pop (998), techno (900)
and heavy metal (1638). The songs were chosen so as to cover a large number of
composers/singers. For instance, for classical music, we have taken pieces from Bartok,
Beethoven, Berlioz, Brahms, Bruch, Chopin, Dvorak, Faure, Grieg, Mahler, Marcello, Mozart,
Rachmaninov, Strauss, Schubert, Schumann, Scriabin, Shostakovich, Sibelius, Stravinsky,
Tchaikovsky, Verdi, Vivaldi, and others.

When a time series is analyzed, one way to assess its variability (complexity) is, at least
in part, to investigate its pdf. In the case of music, the mean amplitude is approximately
zero, since a vibration essentially occurs around this value. In addition, the mean (global)
intensity is not relevant to the variability (complexity) of a song. Motivated by these
facts, our research is based on the pdf of the recorded data regardless of their mean value
and their absolute amplitudes. In other words, we consider that the complexity of a song is
not related to its mean intensity but to the relative variability of the amplitudes.
Thus, instead of employing the amplitude $u_t$ at different time instants $t$, we focus on
$u_t$ minus its mean value $\mu$, divided by its standard deviation $\sigma$. This
corresponds to using $z_t = (u_t - \mu)/\sigma$ instead of $u_t$. Figure 1 illustrates the
behavior of $z_t$ for two songs, a classical piece and a heavy metal song. This figure is
enough to reveal qualitative differences between these two songs. In the classical piece, we
can observe bursts that give rise to a non-Gaussian distribution. For the heavy metal song,
however, the signal is very similar to Gaussian noise, and no complex structure is
perceptible.
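The normalization described above can be sketched in a few lines of Python. This is only an illustration: the function name `normalize_amplitudes` and the toy signal are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used (for signals with millions of samples the difference is negligible).

```python
import math
import statistics


def normalize_amplitudes(u):
    """Return z_t = (u_t - mu) / sigma for a sequence of raw amplitudes u_t."""
    mu = statistics.fmean(u)
    sigma = statistics.pstdev(u, mu)  # population standard deviation (assumption)
    if sigma == 0:
        raise ValueError("constant signal: standard deviation is zero")
    return [(x - mu) / sigma for x in u]


# Hypothetical toy signal standing in for the samples of one audio channel:
# a slowly modulated sine wave with a small constant offset.
u = [math.sin(0.05 * t) * (1.0 + 0.5 * math.sin(0.002 * t)) + 0.1 for t in range(5000)]
z = normalize_amplitudes(u)

# By construction, z has zero mean and unit standard deviation.
print(statistics.fmean(z), statistics.pstdev(z))
```

After this step, two recordings with different loudness or offset but the same shape of fluctuations yield the same series, which is the property the analysis relies on.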

Motivated by these distinct behaviors, we investigate the distribution of $z_t$ for all the
songs in our data set. In Figure 2 we show the pdf for some representative songs. As we can
verify from this figure, the shape of the distributions ranges from long-tailed to Laplace
to Gaussian. A family of functions that has the Gaussian and the Laplace distributions as
particular cases is given by the stretched Gaussian [36], $p(z) = N \exp(-b|z|^c)$, where
$N$ is the normalization constant, $b$ is directly related to the standard deviation and $c$
is a positive parameter. Since the distribution $p(z)$ is normalized to unity and the
variable $z$ is defined in such a way that its standard deviation is equal to one, the
parameters $N$ and $b$ become functions exclusively of $c$, leading to

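$$p(z) = \frac{c}{2}\,\frac{\Gamma(3/c)^{1/2}}{\Gamma(1/c)^{3/2}}\,\exp\!\left[-\left(\frac{\Gamma(3/c)}{\Gamma(1/c)}\right)^{c/2}|z|^{c}\right], \qquad (1)$$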

with $\Gamma(w)$ being the Euler gamma function. Figure 2 also shows the least-squares fits
of the above function to the data. We find good agreement between the data and the model for
the songs represented in this figure, and similar agreement has been found for the others
(at least in the central part of the distribution).
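As an illustration of the fitting step, the sketch below evaluates the unit-variance stretched Gaussian, with $N$ and $b$ fixed by the normalization and unit-variance conditions, and estimates $c$ by least squares against a histogram. The helper names (`stretched_gaussian`, `histogram_density`, `fit_c`), the binning on $[-5, 5]$ and the brute-force grid search are assumptions made for this example; the text does not specify the binning or the optimizer used by the authors.

```python
import math
import random


def stretched_gaussian(z, c):
    """Unit-variance stretched Gaussian p(z) = N exp(-b |z|^c).

    N = (c/2) Gamma(3/c)^(1/2) / Gamma(1/c)^(3/2),  b = (Gamma(3/c)/Gamma(1/c))^(c/2).
    """
    g1 = math.gamma(1.0 / c)
    g3 = math.gamma(3.0 / c)
    n = 0.5 * c * math.sqrt(g3) / g1 ** 1.5
    b = (g3 / g1) ** (c / 2.0)
    return n * math.exp(-b * abs(z) ** c)


def histogram_density(z, lo=-5.0, hi=5.0, bins=50):
    """Bin centers and normalized counts (an estimate of the pdf) on [lo, hi)."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in z:
        k = math.floor((x - lo) / width)
        if 0 <= k < bins:          # samples outside [lo, hi) are ignored
            counts[k] += 1
    centers = [lo + (k + 0.5) * width for k in range(bins)]
    density = [n / (len(z) * width) for n in counts]
    return centers, density


def fit_c(z, c_grid=None):
    """Least-squares estimate of c by brute-force search over a grid (assumption)."""
    if c_grid is None:
        c_grid = [0.3 + 0.01 * k for k in range(271)]   # 0.30, 0.31, ..., 3.00
    centers, density = histogram_density(z)

    def sse(c):
        return sum((stretched_gaussian(x, c) - d) ** 2 for x, d in zip(centers, density))

    return min(c_grid, key=sse)


# Synthetic check: Gaussian noise should be fitted with c close to 2.
rng = random.Random(0)
z_gauss = [rng.gauss(0.0, 1.0) for _ in range(50000)]
c_hat = fit_c(z_gauss)
print(c_hat)
```

Because the sum of squared residuals is taken on a linear scale, the fit is dominated by the central part of the distribution, where the counts are largest; a fit on a logarithmic scale would weight the tails more heavily.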

The only model parameter is $c$, and it may give useful information about music complexity.
First, note that for values of $c$ smaller than one, heavy-tailed distributions emerge. In
some sense, these heavy tails reflect the complex structures that we see in Figure 1a, i.e.,
larger fluctuations. Increasing $c$ makes the tails shorter and recovers some known
distributions (Laplace for $c = 1$ and Gaussian for $c = 2$). In this context, a shorter
tail indicates that larger fluctuations become rare, leading to a music signal very similar
to Gaussian noise (see Figure 1b). From the musical point of view, the word complexity may
be related to several aspects of the song or even to musical taste. In the present context,
it should be viewed as a comparative measure, i.e., a measure of how much the empirical
distributions differ from the Gaussian one.
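One way to make the link between $c$ and the weight of the tails concrete is the kurtosis of the unit-variance stretched Gaussian, which follows from its moments as $\langle z^4 \rangle = \Gamma(5/c)\,\Gamma(1/c)/\Gamma(3/c)^2$. This quantity is not used in the text; it is added here only as an illustrative sketch, and the function name `kurtosis` is hypothetical.

```python
import math


def kurtosis(c):
    """Fourth moment of the unit-variance stretched Gaussian with exponent c."""
    g1, g3, g5 = (math.gamma(k / c) for k in (1, 3, 5))
    return g5 * g1 / g3 ** 2


# Smaller c means heavier tails, hence a larger fourth moment.
values = [(c, kurtosis(c)) for c in (0.5, 1.0, 2.0)]
for c, k in values:
    print(f"c = {c:.1f}  kurtosis = {k:.3f}")
```

The familiar values are recovered at the two reference points: about 3 for the Gaussian ($c = 2$) and about 6 for the Laplace distribution ($c = 1$), while $c = 0.5$ gives a much larger value.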

FIG. 2: (color online) Histograms of some representative songs (labeled in the figure) in
comparison with the stretched Gaussian of Eq. (1). Squares (circles) correspond to the right
(left) channel of the stereo audio. The two channels are quite similar, in the sense that
the statistical results do not change dramatically when considering the right or the left
channel.

FIG. 3: (color online) (a) In ascending order, the mean value of the parameter $c$ of the
stretched Gaussians fitted for each music genre considered here. (b) The distribution of the
parameter $c$ for each genre. (c) Scatter plot of the parameter $c$ versus the Hurst
exponent, $h$, obtained via detrended fluctuation analysis (DFA) [39, 40] of the sound
intensity $z_t^2$. The dashed line is a guide to the eye.

Based on the above discussion, we may use $c$ to sort the songs and music genres in a kind
of complexity order (a smaller $c$ corresponds to a larger complexity). To construct this
rank for music genres, we evaluate the mean value of $c$ over all songs of each music genre
considered here, as shown in Figure 3a. Our findings agree with other works in the sense
that there is a quantitative difference between classical and light/dance music [33, 34].
However, it is worth emphasizing that music genres are not a well-defined concept [38].
Thus, any taxonomy may be controversial, making automatic classification an open problem
like other problems of pattern recognition. To take a glance at this complicated problem, we
also evaluate the probability distribution of $c$ for each music genre, as shown in
Figure 3b. We can see that there are overlapping regions for all genres, reflecting the
fuzzy boundaries in the definition of music genres.
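The genre ranking amounts to grouping the fitted exponents by genre, averaging them, and sorting in ascending order. A minimal sketch follows; the function name `rank_genres_by_mean_c`, the placeholder genre labels and the numerical values are all made up for illustration and are not results from the paper.

```python
from collections import defaultdict
from statistics import fmean


def rank_genres_by_mean_c(fitted):
    """Given (genre, c) pairs, one per song, return (genre, mean c) sorted by mean c."""
    by_genre = defaultdict(list)
    for genre, c in fitted:
        by_genre[genre].append(c)
    means = ((genre, fmean(cs)) for genre, cs in by_genre.items())
    return sorted(means, key=lambda item: item[1])


# Made-up exponents for three placeholder genres (not data from the paper).
fitted = [
    ("genre A", 1.0), ("genre A", 1.4),
    ("genre B", 1.9), ("genre B", 2.1),
    ("genre C", 0.8),
]
ranking = rank_genres_by_mean_c(fitted)
for genre, mean_c in ranking:
    print(f"{genre}: mean c = {mean_c:.2f}")
```

Keeping the per-genre lists, rather than only the means, also gives direct access to the distribution of $c$ within each genre, which is what reveals the overlap between genres.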

Despite the complex situation that emerges in the problem of automatic genre
classification [41-44], our model is very simple. From the qualitative point of view, the
character of songs and music genres is related to multidimensional aspects such as timbre,
melody, harmony and rhythm, among others. Thus, as a minimalist model, the classification
presented here must be viewed as a kind of global measure of these qualitative aspects. In
addition, we note that correlation aspects are lost when we consider only the histogram, as
presented in Figure 2. In the same way, information is also lost when one considers only
some correlations. However, we remark that the results concerning genre classification, here
obtained using only the pdf of the sound amplitude, are in statistical agreement with other
methods based on correlation analysis. This fact suggests a kind of coupling between the
correlation aspects and the non-Gaussian pdfs. To highlight this feature, we evaluated the
Hurst exponent ($h$) of the time series $z_t^2$ and plotted it versus the pdf parameter $c$
in Figure 3c. The data presented in this figure suggest an approximately linear relation
between $c$ and $h$ (Pearson correlation of about -0.7), providing statistical evidence that
the non-Gaussian nature of the pdfs is directly related to the correlations in songs.
Therefore, these two complementary aspects, among others, compose the multidimensional
nature of music quantification and classification.
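For readers unfamiliar with the two quantities compared here, the sketch below gives a minimal first-order DFA estimate of the Hurst exponent and a plain Pearson correlation coefficient. It is a hedged illustration, not the authors' implementation: the function names, the choice of window sizes, the use of non-overlapping windows and the synthetic white-noise test signal are all assumptions.

```python
import math
import random
from statistics import fmean


def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept of ys against xs."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def dfa_exponent(series, scales=(16, 32, 64, 128, 256)):
    """Exponent h from first-order DFA, where the fluctuation scales as F(n) ~ n^h."""
    mean = fmean(series)
    profile, total = [], 0.0
    for x in series:                      # integrated, mean-subtracted series
        total += x - mean
        profile.append(total)
    log_n, log_f = [], []
    for n in scales:
        t = list(range(n))
        sq_sum, count = 0.0, 0
        for start in range(0, len(profile) - n + 1, n):   # non-overlapping windows
            window = profile[start:start + n]
            slope, intercept = linear_fit(t, window)      # local linear trend
            sq_sum += sum((y - (slope * i + intercept)) ** 2 for i, y in zip(t, window))
            count += n
        log_n.append(math.log(n))
        log_f.append(0.5 * math.log(sq_sum / count))      # log of the RMS fluctuation
    return linear_fit(log_n, log_f)[0]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic check: uncorrelated noise should give h close to 0.5.
rng = random.Random(1)
white = [rng.gauss(0.0, 1.0) for _ in range(8192)]
h_white = dfa_exponent(white)
print(h_white)
```

Applied to a song, `dfa_exponent` would be called on the intensity series (for example `[x * x for x in z]` for a normalized amplitude series `z`), and `pearson` on the lists of fitted exponents and Hurst exponents collected over all songs.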

Summing up, we investigated the probability distribution of the normalized sound amplitudes
for more than eight thousand musical pieces. The empirical findings suggest a universal form
of distribution, which proved to be in good agreement with a stretched Gaussian. Because of
the normalization and the standard deviation fixed at one, our distribution has only one
parameter, $c$. We argue that this parameter helps to quantify the complexity of songs as
well as music genres. In addition to this universal feature, we presented empirical evidence
that the non-Gaussian nature of the sound amplitude pdfs is related to the correlation
aspects. As an application, we also hope that the distribution of sound amplitudes presented
here may have implications for stochastic music composition.